COS
Co-Operating (Operating) Systems

A COS (Cooperating Operating System, or Co-Operating System for short) is an operating system in which the functionality of the operating system is spread across many small cooperating OS processors. A COS is the first attempt at building a platform for GREAT Computing.

Many influences have been drawn upon in this work. Around 1987 Jason Cozens worked with Ivor Catt at Sinclair Research. The work involved looking at ways to program the Catt Spiral, including the use of CSP (Hoare, 1985) to implement Fast Fourier Transforms.

The Catt Spiral was a way of addressing the problem of flaws in silicon wafers. A major factor limiting the size of silicon chips is that the probability of a flaw on a chip increases with its area, that is, with the square of the width of the chip (assuming a square chip). Doubling the width of a chip therefore increases the probability of the chip failing by a factor of about 4. This makes the chance of using a whole wafer with today's manufacturing techniques very small.

Catt split the wafer into cells. A cell would be selected on the edge of the wafer, and this cell would probe its neighbouring cells to find one that was working. When a working cell was found, that cell would in turn probe its neighbours. In this way bad cells would be isolated and a chain of good cells would be established. The major problems in programming the Catt Spiral were the limitations of the architecture and the I/O bottleneck.

Catt later (circa 1990) went on to propose the Kernel Logic Machine. This extended the spiral idea to connect up a mesh network of processors. There seems to have been limited interest in this at the time.

Also during the late 1980s Jason Cozens was researching Adaptable Processor Systems. This work was influenced by Sun's NeWS and looked at developing formal models of such systems.
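The Catt Spiral's cell-chaining procedure described above can be illustrated with a small Python sketch. This is a simplified model, not Catt's hardware design: the wafer is assumed to be a rectangular grid of booleans, the probe order (right, down, left, up) is an arbitrary choice, the chain never backtracks, and the names `build_chain` and `EXAMPLE_WAFER` are hypothetical.

```python
def build_chain(wafer, start):
    """Chain working cells together, starting from an edge cell.

    wafer: list of rows of booleans (True = working cell, False = flawed).
    start: (row, col) of the edge cell selected first.
    Returns the list of cell coordinates in chain order.
    """
    rows, cols = len(wafer), len(wafer[0])
    if not wafer[start[0]][start[1]]:
        return []  # the selected edge cell is itself flawed

    chain = [start]
    in_chain = {start}
    # Assumed probe order: right, down, left, up.
    probes = [(0, 1), (1, 0), (0, -1), (-1, 0)]

    while True:
        r, c = chain[-1]
        for dr, dc in probes:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and wafer[nr][nc] and (nr, nc) not in in_chain:
                # First working, unused neighbour joins the chain
                # and takes over the probing.
                chain.append((nr, nc))
                in_chain.add((nr, nc))
                break
        else:
            # No working neighbour left: the chain ends here.
            return chain


EXAMPLE_WAFER = [
    [True,  True,  False],
    [False, True,  True],
    [True,  True,  False],
]

print(build_chain(EXAMPLE_WAFER, (0, 0)))
# -> [(0, 0), (0, 1), (1, 1), (1, 2)]
```

In this example the flawed cells never enter the chain, but two working cells in the bottom row are also left out because the sketch does not backtrack. A real scheme would need a probing strategy that reaches as many good cells as possible.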
As part of the Adaptable Processor Systems work a calculus of higher-order processes was developed, but this was too close to Bent Thomsen's CHOCS (Thomsen, 1989) to be published. A number of fruitful discussions and meetings were had with Bent Thomsen.

Eager Queue Protocols (to be discussed below) were started around 2004 (see OpenEd-Lab4 for some early ideas) as a way of monitoring distributed processes as part of the Child Trust Fund project.

Context

Application Area - General Purpose Computing

The design presented here is targeted at general purpose computing. We are looking for an architecture that provides reasonable performance across many problem domains. The architecture is not aimed specifically at HTC (High Throughput Computing), and in particular it is not aimed at computing where the architecture can be optimised before computing starts. Process networks in general purpose computing tend to lack regular structure. The intention is that a COS can easily configure process networks with irregular structures.

MOVE

The main service of an operating system is process management. Process management consists of:
* Creating Processes
* Scheduling Processes
* Destroying Processes

Assumptions

To simplify the initial development some assumptions have been made. The main assumption is that there is a large number of processors. This largely removes the need for a scheduler. The assumption is not totally unfounded: in the late 1980s Ivor Catt proposed a Kernel machine with 1,000,000 cores.
* Large number of processors
** Catt Kernel http://www.patentstorm.us/patents/5055774.html
** Intel's Single-chip Cloud Computer

Design Objectives

Reliability and energy efficiency are becoming dominant constraints in the design of computing systems. These are the primary design objectives for a COS. Traditionally the most expensive part of a computer has been the central processing unit. The original computers were expensive from a manufacturing perspective.
Modern microprocessors are expensive from an energy perspective.

Linking Text

In the design of a Co-Operating System we want to aim at satisfying Tanenbaum's 3 principles (Tanenbaum, 2001, pp. 859-861):
* Principle 1: Simplicity
* Principle 2: Completeness
* Principle 3: Efficiency

Development of a COS

If we start by looking at the objectives for a platform for GREAT computing we have the following list:
* Guaranteed
* Reliable
* Efficient
* Affordable
* Testable

Guaranteed Reliable

The hypothesis for this work states: We want an architecture where we can prove our systems correct. It is generally accepted that for proofs to be feasible a system must be component based. Working bottom-up, we want the PQ cells to be the basic component that provides a foundation.

In Can We Make Operating Systems Reliable and Secure? (Tanenbaum et al., 2006) the authors consider 4 attempts at improving the reliability and security of operating systems:
# Armored Operating Systems
# Paravirtual machines
# Microkernels
# Singularity

Approaches 1 and 2 are intended to improve the reliability of existing (legacy) systems. Approaches 3 and 4 replace legacy operating systems with more reliable and secure ones. The multiserver approach runs each driver and operating system component in a separate user process and allows them to communicate using the microkernel's IPC mechanism. Finally, Singularity, the most radical approach, uses a type-safe language, a single address space, and formal contracts to carefully limit what each module can do.

A COS makes the basic component a process and isolates the process in hardware. Some of the ideas here come from the Singularity project (Hunt, Larus, 2007). In Singularity, processes run as software isolated processes. In a COS, processes run as hardware isolated processes. Much of the work in Singularity seems to reflect work based on the transputer. The main difference would appear to be that Singularity is a more general purpose OS.
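The idea of isolated processes that interact only through IPC can be illustrated with a minimal Python sketch. It is a model only: threads stand in for separate processors and provide no real hardware isolation, and the class `IsolatedProcess`, its message format, and the `demo` function are hypothetical names chosen for this illustration rather than part of any COS design.

```python
import queue
import threading


class IsolatedProcess(threading.Thread):
    """Models one process on its own processor.

    The process state lives only inside run(), so the only way for
    other code to affect it is to send a message to the inbox.
    """

    def __init__(self, name):
        super().__init__(name=name, daemon=True)
        self.inbox = queue.Queue()

    def send(self, message, reply_to=None):
        # The single entry point: enqueue a message, optionally with
        # a queue on which the reply should be delivered.
        self.inbox.put((message, reply_to))

    def run(self):
        state = {}  # private: no other process holds a reference
        while True:
            message, reply_to = self.inbox.get()
            if message == ("stop",):
                break
            op, key, *rest = message
            if op == "put":
                state[key] = rest[0]
                result = "ok"
            elif op == "get":
                result = state.get(key)
            else:
                result = "unknown operation"
            if reply_to is not None:
                reply_to.put(result)


def demo():
    store = IsolatedProcess("store")
    store.start()
    replies = queue.Queue()

    store.send(("put", "x", 42), replies)
    first = replies.get(timeout=1)

    store.send(("get", "x"), replies)
    second = replies.get(timeout=1)

    store.send(("stop",))
    store.join(timeout=1)
    return first, second
```

Calling `demo()` stores a value in the process and reads it back purely by exchanging messages. In a COS the same discipline would be enforced by the hardware rather than by programming convention.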
Guarantees and Reliability - Need Reworking

It is not easy to guarantee systems which are not reliable. Therefore the first objective for a COS is reliability. If we can build reliable systems we can guarantee them and meet the GR objectives.

Refactoring the Microkernel

The following diagram is from (Tanenbaum, 2001):

Good system design consists of separate concerns that can be combined independently. (Tanenbaum, 2001, p. 871)

In the design of a Co-Operating System we want to try to treat IPC, Process Management and Scheduling as orthogonal concepts. We start by separating process management and IPC using an IPC network. Later in the design we will broaden process management to resource management.
* Transputer
* Singularity
* Microkernel
* Law of Demeter
* Catt Spiral/Kernel
* CSK
* Per Brinch Hansen

QCells

Efficient

Control of Number of QCells with Fixed Reference

Control of Number of QCells with Adaptive Reference

Affordable

Testable

Key Points
# A COS is aimed at general purpose computing.
# Separates the Operating System from the Applications
# Decentralised
# Aims to run with Just Enough Power
# Fast Boot, as each process boots in parallel
# Processes run on isolated Processors (cores)
# Simple Cell Architecture - PQ Cells

Relation to Microkernels

A COS is very similar to a microkernel in that it tries to implement the minimal functionality. The Wikipedia article on microkernels (Microkernel - Essential Components and Minimality) states that, as a microkernel must allow building arbitrary operating system services on top, it must provide some core functionality. At the least this includes:
# some mechanisms for dealing with address spaces (this is required for managing memory protection);
# some execution abstraction to manage CPU allocation (typically threads or scheduler activations); and
# inter-process communication (required to invoke servers running in their own address spaces).
This minimal design was pioneered by Brinch Hansen's Nucleus (Brinch Hansen, 1970) and the hypervisor of IBM's VM.

A COS satisfies these requirements as follows:
# Processes run on separate processors; this provides the first part of memory protection.
# The execution abstraction is parallel processors.
# Inter-process communication is an open question at the moment. There are several possible approaches.

Process Granularity

A COS Architecture

Difficulties in Implementing a COS

See Also

Operating System Research